vocal tract
Quantification of Tenseness in English and Japanese Tense-Lax Vowels: A Lagrangian Model with Indicator $\theta_1$ and Force of Tenseness $F_{tense}(t)$
The concept of vowel tenseness has traditionally been examined through the binary distinction of tense and lax vowels. However, no universally accepted quantitative definition of tenseness has been established in any language. Previous studies, including those by Jakobson, Fant, and Halle (1951) and Chomsky and Halle (1968), have explored the relationship between vowel tenseness and the vocal tract. Building on these foundations, Ishizaki (2019, 2022) proposed an indirect quantification of vowel tenseness using formant angles $\theta_1$ and $\theta_{F1}$ and their first and second derivatives, $dZ_1(t)/dt = \lim \tan \theta_1(t)$ and $d^2 Z_1(t)/dt^2 = d/dt \lim \tan \theta_1(t)$. This study extends this approach by investigating the potential role of a force-related parameter in determining vowel quality. Specifically, we introduce a simplified model based on the Lagrangian equation to describe the dynamic interaction of the tongue and jaw within the oral cavity during the articulation of close vowels. This model provides a theoretical framework for estimating the forces involved in vowel production across different languages, offering new insights into the physical mechanisms underlying vowel articulation. The findings suggest that this force-based perspective warrants further exploration as a key factor in phonetic and phonological studies.
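As a minimal illustration of the kind of Lagrangian formulation the abstract describes (a sketch under assumed symbols, not the authors' actual model), take a single generalized coordinate $q(t)$ for the combined tongue-jaw displacement, an assumed effective mass $m$, and an assumed stiffness $k$, with the force of tenseness entering as a generalized external force:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = F_{tense}(t), \qquad L = \frac{1}{2} m \dot{q}^2 - \frac{1}{2} k q^2,$$

which reduces to $m \ddot{q} + k q = F_{tense}(t)$; in such a setup, $F_{tense}(t)$ could in principle be estimated from measured articulator kinematics.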
Multimodal Segmentation for Vocal Tract Modeling
Rishi Jain, Bohan Yu, Peter Wu, Tejas Prabhune, Gopala Anumanchipalli
Accurate modeling of the vocal tract is necessary to construct articulatory representations for interpretable speech processing and linguistics. However, vocal tract modeling is challenging because many internal articulators are occluded from external motion capture technologies. Real-time magnetic resonance imaging (RT-MRI) allows measuring precise movements of internal articulators during speech, but annotated datasets of MRI are limited in size due to time-consuming and computationally expensive labeling methods. We first present a deep labeling strategy for the RT-MRI video using a vision-only segmentation approach. We then introduce a multimodal algorithm using audio to improve segmentation of vocal articulators. Together, we set a new benchmark for vocal tract modeling in MRI video segmentation and use this to release labels for a 75-speaker RT-MRI dataset, increasing the amount of labeled public RT-MRI data of the vocal tract by over a factor of 9. The code and dataset labels can be found at \url{rishiraij.github.io/multimodal-mri-avatar/}.
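As a rough illustration of how audio can be fused with video features for articulator segmentation, here is a minimal sketch; the module names, layer sizes, and late-fusion strategy are assumptions for exposition and are not the architecture from the paper.

```python
# Illustrative audio-visual fusion for per-pixel segmentation of RT-MRI frames.
import torch
import torch.nn as nn

class AudioVisualSegmenter(nn.Module):
    def __init__(self, n_classes: int = 8, audio_dim: int = 40):
        super().__init__()
        # Vision encoder: turns a single-channel MRI frame into a feature map.
        self.vision = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Audio encoder: embeds a frame-aligned acoustic feature vector (e.g. mel bands).
        self.audio = nn.Sequential(nn.Linear(audio_dim, 32), nn.ReLU())
        # Per-pixel classifier over fused vision features and broadcast audio features.
        self.head = nn.Conv2d(64, n_classes, kernel_size=1)

    def forward(self, frame: torch.Tensor, audio_feat: torch.Tensor) -> torch.Tensor:
        v = self.vision(frame)                       # (B, 32, H, W)
        a = self.audio(audio_feat)                   # (B, 32)
        a = a[:, :, None, None].expand(-1, -1, v.shape[2], v.shape[3])
        return self.head(torch.cat([v, a], dim=1))   # (B, n_classes, H, W) logits

# Usage: a batch of two 84x84 frames, each paired with a 40-dim audio feature vector.
model = AudioVisualSegmenter()
logits = model(torch.randn(2, 1, 84, 84), torch.randn(2, 40))
print(logits.shape)  # torch.Size([2, 8, 84, 84])
```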
Voice deepfakes are getting easier to spot
New research has shown that voice deepfakes are becoming easier to spot as synthetic recreations of real voices, thanks to the anatomy of our vocal tracts. Researchers at the University of Florida have devised a method of simulating images of a human vocal tract's apparent movements while a voice clip - real or fake - is played back. Professor of Computer and Information Science and Engineering Patrick Traynor and PhD student Logan Blue wrote that they and their colleagues found that simulations prompted by voice deepfakes weren't constrained by "the same anatomical limitations humans have", with some vocal tract measurements having "the same relative diameter and consistency as a drinking straw". Though scientists are starting to spot voice deepfakes with simulation and anatomical comparison, the risk of an ordinary person being tricked by any deepfake - which could lead to identity theft - remains a problem. Ordinary people don't yet have access to these tools.
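To make the idea concrete, a detector in this spirit might estimate cross-sectional diameters along the reconstructed vocal tract and flag values outside a human-plausible range; the sketch below is purely illustrative, and the threshold values are assumptions rather than the Florida team's actual method.

```python
# Illustrative anatomical-plausibility check on estimated vocal-tract diameters.
from typing import Sequence

# Assumed plausibility band for adult vocal-tract cross-sectional diameters (cm).
MIN_DIAMETER_CM = 0.5
MAX_DIAMETER_CM = 4.5

def looks_synthetic(diameters_cm: Sequence[float]) -> bool:
    """Return True if any estimated segment diameter falls outside the assumed human range."""
    return any(d < MIN_DIAMETER_CM or d > MAX_DIAMETER_CM for d in diameters_cm)

# A deepfake might imply straw-like segments narrower than the assumed human minimum.
print(looks_synthetic([0.4, 0.4, 0.4, 0.4]))  # True
print(looks_synthetic([1.2, 2.8, 3.1, 1.9]))  # False
```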
Deepfake audio has a tell
An office worker answers the phone and hears his boss, in a panic, tell him that she forgot to transfer money to the new contractor before she left for the day and needs him to do it. She gives him the wire transfer information, and with the money transferred, the crisis has been averted. The worker sits back in his chair, takes a deep breath, and watches as his boss walks in the door. The voice on the other end of the call was not his boss. The voice he heard was that of an audio deepfake, a machine-generated audio sample designed to sound exactly like his boss.
Harbour seals can learn how to change their voices to seem bigger
Consider the squeak of a mouse and the low rumble of a lion's roar. In the animal kingdom, bigger animals usually produce lower pitch sounds as a result of their larger larynges and longer vocal tracts. But harbour seals seem to break that rule: they can learn how to change their calls. That means they can deliberately move between lower or higher pitch sounds and make themselves sound bigger than they really are. "The information that is in their calls is not necessarily honest," says Koen de Reus at the Max Planck Institute for Psycholinguistics in Nijmegen, Netherlands.
#ICML2021 invited talk round-up 2: randomized controlled trials, encoding speech, and molecular science
In this post, we summarise the final three invited talks from the International Conference on Machine Learning (ICML). These presentations covered: how machine learning can complement randomised controlled trials, encoding and decoding speech, and molecular science. Esther Duflo's work centres on the use of randomised controlled trials (RCTs): she runs policy experiments with the aim of understanding which policies work and which don't, with a particular focus on reducing poverty. Work of this type involves many causal questions, for which there are often many competing ideas, and the field offers little theoretical guidance, so experiments are needed to determine successful policies.
Mind-reading device uses AI to turn brainwaves into audible speech
Electrodes on the brain have been used to translate brainwaves into words spoken by a computer – which could be useful in the future to help people who have lost the ability to speak. When you speak, your brain sends signals from the motor cortex to the muscles in your jaw, lips and larynx to coordinate their movement and produce a sound. "The brain translates the thoughts of what you want to say into movements of the vocal tract, and that's what we're trying to decode," says Edward Chang at the University of California San Francisco (UCSF). He and his colleagues created a two-step process to decode those thoughts using an array of electrodes surgically placed onto the part of the brain that controls movement, and a computer simulation of a vocal tract to reproduce the sounds of speech. In their study, they worked with five participants who had electrodes on the surface of their motor cortex as a part of their treatment for epilepsy.
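The two-step idea can be sketched as a pair of sequence models: one mapping neural features to articulator kinematics, and a second mapping those kinematics to acoustic features for a synthesizer. The layer types and dimensions below are assumptions for illustration, not the UCSF implementation.

```python
# Illustrative two-stage decoder: neural activity -> articulation -> acoustics.
import torch
import torch.nn as nn

class TwoStageSpeechDecoder(nn.Module):
    def __init__(self, n_electrodes: int = 256, n_articulators: int = 33, n_acoustic: int = 32):
        super().__init__()
        # Stage 1: cortical features -> articulator kinematics (lips, jaw, tongue, larynx).
        self.brain_to_articulation = nn.LSTM(n_electrodes, 128, batch_first=True)
        self.artic_out = nn.Linear(128, n_articulators)
        # Stage 2: articulator kinematics -> acoustic features driving a speech synthesizer.
        self.articulation_to_acoustics = nn.LSTM(n_articulators, 128, batch_first=True)
        self.acoustic_out = nn.Linear(128, n_acoustic)

    def forward(self, neural: torch.Tensor) -> torch.Tensor:
        h, _ = self.brain_to_articulation(neural)      # (B, T, 128)
        artic = self.artic_out(h)                      # (B, T, n_articulators)
        h2, _ = self.articulation_to_acoustics(artic)  # (B, T, 128)
        return self.acoustic_out(h2)                   # (B, T, n_acoustic)

# Usage: two trials of 100 time steps of electrode features.
decoder = TwoStageSpeechDecoder()
acoustics = decoder(torch.randn(2, 100, 256))
print(acoustics.shape)  # torch.Size([2, 100, 32])
```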
Implant turns brain signals into synthesized speech
People with neurological conditions who lose the ability to speak can still send the brain signals used to control the speech articulators (such as the lips, jaw and larynx), and UCSF researchers might just use that knowledge to bring voices back. They've crafted a brain-machine interface that can turn those brain signals into mostly recognizable speech. Instead of trying to read thoughts, the machine learning technology picks up on individual nerve commands and translates those to a virtual vocal tract that approximates the intended output. Although the system accurately captures the distinctive sound of someone's voice and is frequently easy to understand, there are times when the synthesizer produces garbled words. It's still miles better than earlier approaches that didn't try to replicate the vocal tract, though.